Acquisition of Large Scale Categorial Grammar Lexicons

Authors

  • Stephen Watkinson
  • Suresh Manandhar
Abstract

A system is presented for inducing Categorial Grammar (CG) lexicons for natural language from either unannotated or minimally annotated corpora extracted from the Penn Treebank. A combination of symbolic and stochastic methods has been used to build a computationally effective and psychologically plausible system, which learns linguistically useful lexicons. The system has a variety of parameters, including the corpus annotation used, the knowledge given to the learner and the weight given to the symbolic and stochastic methods. We present results from a set of experiments that investigate these parameters. The results also show that the system performs well even when compared with systems used for simpler problems.

Similar Resources

PhD Proposal – The Lexicon in Combinatory Categorial Grammar: An Explanatory Theory of Verbal Categories in Natural Languages

The aim of this project is to elaborate a theory of natural language lexicons for Combinatory Categorial Grammar (CCG), a mildly context-sensitive, polynomially time-parsable variant of categorial grammar. This theory will have both a descriptive aspect, exploring the use of appropriate formal machinery for expressing lexical generalisations, and an explanatory aspect, accounting for observed pa...

An inheritance-based theory of the lexicon in combinatory categorial grammar

This thesis proposes an extended version of the Combinatory Categorial Grammar (CCG) formalism, with the following features: (1) grammars incorporate inheritance hierarchies of lexical types, defined over a simple, feature-based constraint language; (2) CCG lexicons are, or at least can be, functions from forms to these lexical types. This formalism, which I refer to as ‘inheritance-driven’ CCG (I-...

Learning Compact Lexicons for CCG Semantic Parsing

We present methods to control the lexicon size when learning a Combinatory Categorial Grammar semantic parser. Existing methods incrementally expand the lexicon by greedily adding entries, considering a single training datapoint at a time. We propose using corpus-level statistics for lexicon learning decisions. We introduce voting to globally consider adding entries to the lexicon, and pruning ...

An HDP Model for Inducing Combinatory Categorial Grammars

We introduce a novel nonparametric Bayesian model for the induction of Combinatory Categorial Grammars from POS-tagged text. It achieves state-of-the-art performance on a number of languages, and induces linguistically plausible lexicons.

Semantic Bootstrapping of Type-Logical Grammar

A procedure is described which induces type-logical grammar lexicons from sentences annotated with skeletal terms of the simply typed lambda calculus. A generalized formulae-as-types correspondence is exploited to obtain all the type-logical proofs of the sample sentences from their lambda terms, and the resulting lexicons are then optimally unified, which effectively unifies the syntactic categ...

Publication date: 2001